Compression of Biological Sequences by Greedy Off-Line Textual Substitution
Related resources
Some Theory and Practice of Greedy Off-Line Textual Substitution
Greedy off-line textual substitution refers to the following approach to compression or structural inference. Given a long textstring x, a substring w is identified such that replacing all instances of w in x except one by a suitable pair of pointers yields the highest possible contraction of x; the process is then repeated on the contracted textstring, until substrings capable of producing contr...
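The loop described in this abstract can be illustrated with a small brute-force sketch. It is not the authors' implementation: the cost model (each replaced instance costs a fixed `pointer_cost` characters), the use of Unicode private-use characters as stand-ins for pointers, and the function names `best_substring`, `greedy_offline`, and `expand` are all assumptions made for illustration. The search is exhaustive over substrings, so it is only practical for short inputs.

```python
def best_substring(text, pointer_cost=2):
    """Return (gain, w) for the substring with the highest modeled contraction.

    Assumed cost model: keeping one instance of w and replacing the other
    non-overlapping instances saves (count - 1) * (len(w) - pointer_cost).
    """
    best_gain, best_w = 0, None
    seen = set()
    n = len(text)
    for length in range(pointer_cost + 1, n // 2 + 1):
        for start in range(n - length + 1):
            w = text[start:start + length]
            if w in seen:
                continue
            seen.add(w)
            count = text.count(w)  # non-overlapping, left-to-right
            gain = (count - 1) * (length - pointer_cost)
            if gain > best_gain:
                best_gain, best_w = gain, w
    return best_gain, best_w


def greedy_offline(text, pointer_cost=2):
    """Repeat the best substitution until no substring yields a positive gain."""
    if pointer_cost < 1:
        raise ValueError("pointer_cost must be at least 1")
    rules = []
    while True:
        _, w = best_substring(text, pointer_cost)
        if w is None:
            break
        # Hypothetical pointer stand-in: one private-use character per rule.
        # Assumes the input itself contains no private-use characters.
        marker = chr(0xE000 + len(rules))
        first = text.find(w)
        keep = text[:first + len(w)]  # the single instance left in place
        rest = text[first + len(w):].replace(w, marker)
        text = keep + rest
        rules.append((marker, w))
    return text, rules


def expand(text, rules):
    """Invert greedy_offline by expanding markers, most recent rule first."""
    for marker, w in reversed(rules):
        text = text.replace(marker, w)
    return text


original = "abracadabra abracadabra"
compressed, rules = greedy_offline(original)
print(len(original), len(compressed), [w for _, w in rules])
```

Because later rules may be chosen on text that already contains markers, `expand` undoes the rules in reverse order. A practical implementation would replace the exhaustive search with an index such as a suffix tree and would use a cost model matched to the actual pointer encoding.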
Incremental Data Compression (Extended Abstract)
Data compression is used in data transmission and data storage. When data compression is used, data is transmitted faster, and file storage requires less space. Many aspects of data compression are described in Lelewer and Hirschberg [6], and Storer [9]. An important technique for data compression is textual substitution. Textual substitution identifies repeated substrings and replaces some or all...
DNA Sequence Compression Using the Burrows-Wheeler Transform
We investigate off-line dictionary oriented approaches to DNA sequence compression, based on the Burrows-Wheeler Transform (BWT). The preponderance of short repeating patterns is an important phenomenon in biological sequences. Here, we propose off-line methods to compress DNA sequences that exploit the different repetition structures inherent in such sequences. Repetition analysis is performed...
Linear-Time Off-Line Text Compression by Longest-First Substitution
Given a text, grammar-based compression constructs a grammar that generates the text. There are many text compression techniques of this type. Each compression scheme is categorized as either off-line or on-line, according to how the text is processed. One representative tactic for off-line compression is to substitute the longest repeated factors of a text with a production ...
Publication date: 2000